Skip to content

Comments

Fix unreadable Shift-JIS encoded QR Code#173

Closed
askdkc wants to merge 1 commit intoBacon:mainfrom
askdkc:patch-1
Closed

Fix unreadable Shift-JIS encoded QR Code#173
askdkc wants to merge 1 commit intoBacon:mainfrom
askdkc:patch-1

Conversation

@askdkc
Copy link
Contributor

@askdkc askdkc commented Mar 21, 2024

This PR is fix for the issue #172

@codecov
Copy link

codecov bot commented Mar 21, 2024

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 64.93%. Comparing base (c01758c) to head (6e015af).
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
src/Encoder/Encoder.php 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               main     #173   +/-   ##
=========================================
  Coverage     64.93%   64.93%           
+ Complexity      981      980    -1     
=========================================
  Files            48       48           
  Lines          3094     3094           
=========================================
  Hits           2009     2009           
  Misses         1085     1085           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@DASPRiD
Copy link
Member

DASPRiD commented Mar 21, 2024

I'm not a big fan of this – the kanji mode exists specifically to massively reduce the amount of storage required for that.

Looking at the original zxing code I noticed that they subtract 1 from the length to make it uneven, which we don't do.

Could you test locally if that makes a difference for your result?

@DASPRiD
Copy link
Member

DASPRiD commented Mar 21, 2024

I also tried to generate the same QR code with ZXing, which yielded a valid QR code, so there must be some bug in Bacon, but I can't tell, which.

QR Code

@askdkc
Copy link
Contributor Author

askdkc commented Mar 21, 2024

@DASPRiD

I tested your suggestion but unfortunately the result is still unreadable QR code.
qrcode

@askdkc
Copy link
Contributor Author

askdkc commented Mar 21, 2024

I also tried to generate the same QR code with ZXing, which yielded a valid QR code, so there must be some bug in Bacon, but I can't tell, which.

@DASPRiD

Could you test with "あいうえお" not "あいうえお123"?

@DASPRiD
Copy link
Member

DASPRiD commented Mar 21, 2024

You can open the link yourself :)

@askdkc
Copy link
Contributor Author

askdkc commented Mar 21, 2024

You can open the link yourself :)

My bad😅 (I thought you are testing it in locally.)

It works with "あいうえお" in ZXing. So yeah, I think something wrong with Bacon but I'm not sure what is.

@DASPRiD
Copy link
Member

DASPRiD commented Mar 21, 2024

I'd recommend to poke around and compare the code paths for kanji between Bacon and ZXing, it might be some very small knob.

@DASPRiD DASPRiD marked this pull request as draft March 22, 2024 19:50
@vlakoff
Copy link
Contributor

vlakoff commented Nov 11, 2025

I made an attempt to find the mistake, but haven't succeeded so far.

That bytes.length - 1 in the zxing code is actually useless:

    int maxI = bytes.length - 1; // bytes.length must be even
    for (int i = 0; i < maxI; i += 2) {

Since we're sure that bytes.length is even and the iterator increments by 2, this loop would perform the same number of iterations as:

    int maxI = bytes.length;
    for (int i = 0; i < maxI; i += 2) {
  • Back to the PHP implementation, both iconv() and mb_convert_encoding() provide the expected pairs of bytes:
$bytes = iconv('utf-8', 'SHIFT-JIS', 'あいうえお');
print_r(unpack('C*', $bytes));

$bytes = mb_convert_encoding('あいうえお', 'SHIFT-JIS');
print_r(unpack('C*', $bytes));

Both result in the same expected output:

Array
(
    [1] => 130
    [2] => 160
    [3] => 130
    [4] => 162
    [5] => 130
    [6] => 164
    [7] => 130
    [8] => 166
    [9] => 130
    [10] => 168
)
  • So far, I haven't detected errors in the rest of the code, including appendBits() and appendBit().

@vlakoff
Copy link
Contributor

vlakoff commented Nov 11, 2025

I managed to get a working QR code by replacing strlen() with mb_strlen() in the following line:

$numLetters = (Mode::BYTE() === $mode ? $dataBits->getSizeInBytes() : strlen($content));

This definitely warrants deeper analysis and understanding, but this is the culprit.

@DASPRiD
Copy link
Member

DASPRiD commented Nov 11, 2025

Oh, great find, that does actually make a lot of sense, since the variable literally indicates number of letters, not bytes! Though we are still using iconv in this library, as it's more widely installed, so iconv_strlen would probably be correct here.

vlakoff added a commit to vlakoff/BaconQrCode that referenced this pull request Nov 11, 2025
Fixes unreadable QR codes in Kanji mode with Shift-JIS encoding (see Bacon#172).

PR Bacon#173 worked around the issue by forcing Byte mode, but that lost Kanji mode’s efficiency.

The actual cause was using strlen() to count characters.

Replacing it with iconv_strlen($content, 'utf-8') ensures correct character count and restores proper Kanji encoding.
@DASPRiD
Copy link
Member

DASPRiD commented Nov 15, 2025

Superseded by #200

@DASPRiD DASPRiD closed this Nov 15, 2025
@askdkc askdkc deleted the patch-1 branch November 15, 2025 22:03
vlakoff added a commit to vlakoff/BaconQrCode that referenced this pull request Nov 16, 2025
Fixes unreadable QR codes in Kanji mode with Shift-JIS encoding (see Bacon#172).

PR Bacon#173 worked around the issue by forcing Byte mode, but that lost Kanji mode’s efficiency.

The actual cause was using strlen() to count characters.

Replacing it with iconv_strlen($content, 'utf-8') ensures correct character count and restores proper Kanji encoding.
DASPRiD pushed a commit that referenced this pull request Nov 16, 2025
* Fix Kanji mode QR code generation

Fixes unreadable QR codes in Kanji mode with Shift-JIS encoding (see #172).

PR #173 worked around the issue by forcing Byte mode, but that lost Kanji mode’s efficiency.

The actual cause was using strlen() to count characters.

Replacing it with iconv_strlen($content, 'utf-8') ensures correct character count and restores proper Kanji encoding.

* Test: Cover exclusive vs. mixed Shift-JIS data for Kanji mode

* Use strlen() for numeric and alphanumeric modes to avoid iconv overhead

For NUMERIC and ALPHANUMERIC modes, strlen() is sufficient since these modes only
operate on single-byte characters. strlen() runs in O(1) time and avoids the
overhead of iconv_strlen().

* Rewrite appendBytes() using a PHP match

Nicer, and more consistent with some other code in the encode() method.

Invalid modes are really not expected here, as this private method is called only by encode(),
and the mode is determined right at the beginning of encode() by calling chooseMode().

Though, if an invalid mode ever comes here, the match would throw an UnhandledMatchError.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants